Search for: All records

Creators/Authors contains: "Huang, Yukun"


  1. We show that existing evaluations for assessing the factuality of news from conventional sources, such as claims on fact-checking websites, result in high accuracies over time for LLM-based detectors—even after their knowledge cutoffs. This suggests that recent popular false information from such sources can be easily identified due to its likely presence in pre-training/retrieval corpora or the emergence of salient, yet shallow, patterns in these datasets. Instead, we argue that a proper factuality evaluation dataset should test a model’s ability to reason about current events by retrieving and reading related evidence. To this end, we develop a novel pipeline that leverages natural language feedback from a RAG-based detector to iteratively modify real-time news into deceptive variants that challenge LLMs. Our iterative rewrite decreases the binary classification ROC-AUC by an absolute 17.5 percent for a strong RAG-based GPT-4o detector. Our experiments reveal the important role of RAG in both evaluating and generating challenging news examples, as retrieval-free LLM detectors are vulnerable to unseen events and adversarial attacks, while feedback from RAG-based evaluation helps discover more deceitful patterns. 
    Free, publicly-accessible full text available July 1, 2026
  2. Speculative Decoding (SD) enforces strict distributional equivalence to the target model when accepting candidate tokens. While it maintains the target model’s generation quality, this strict equivalence limits the speedup achievable by SD and prevents users from trading deviations from the target distribution for further inference speed gains. To address these limitations, we introduce Fuzzy Speculative Decoding (FSD), a decoding algorithm that generalizes SD by accepting candidate tokens based on the divergences between the target and draft model distributions. By allowing for controlled divergence from the target model, FSD enables users to flexibly trade generation quality for inference speed. Across several benchmarks, our method runs over 5 tokens per second faster than SD at only an approximately 2% absolute reduction in benchmark accuracy. In many cases, FSD even matches SD benchmark accuracy while running over 2 tokens per second faster, demonstrating that distributional equivalence is not necessary to maintain target model performance. Furthermore, FSD can be seamlessly integrated into existing SD extensions; we demonstrate this by applying FSD to EAGLE-2, greatly enhancing this existing extension’s efficiency while allowing it to leverage FSD’s tunable quality-speed trade-off.
    Free, publicly-accessible full text available July 1, 2026
  3. Large Language Models (LLMs) are often augmented with external contexts, such as those used in retrieval-augmented generation (RAG). However, these contexts can be inaccurate or intentionally misleading, leading to conflicts with the model’s internal knowledge. We argue that robust LLMs should demonstrate situated faithfulness, dynamically calibrating their trust in external information based on their confidence in the internal knowledge and the external context to resolve knowledge conflicts. To benchmark this capability, we evaluate LLMs across several QA datasets, including a newly created dataset featuring in-the-wild incorrect contexts sourced from Reddit posts. We show that when provided with both correct and incorrect contexts, both open-source and proprietary models tend to overly rely on external information, regardless of its factual accuracy. To enhance situated faithfulness, we propose two approaches: Self-Guided Confidence Reasoning (SCR) and Rule-Based Confidence Reasoning (RCR). SCR enables models to self-assess the confidence of external information relative to their own internal knowledge to produce the most accurate answer. RCR, in contrast, extracts explicit confidence signals from the LLM and determines the final answer using predefined rules. Our results show that for LLMs with strong reasoning capabilities, such as GPT-4o and GPT-4o mini, SCR outperforms RCR, achieving improvements of up to 24.2% over a direct input augmentation baseline. Conversely, for a smaller model like Llama-3-8B, RCR outperforms SCR. Fine-tuning SCR with our proposed Confidence Reasoning Direct Preference Optimization (CR-DPO) method improves performance on both seen and unseen datasets, yielding an average improvement of 8.9% on Llama-3-8B. In addition to quantitative results, we offer insights into the relative strengths of SCR and RCR. Our findings highlight promising avenues for improving situated faithfulness in LLMs.
    Free, publicly-accessible full text available June 1, 2026
  4. The detached trans-Neptunian objects (TNOs) are those with semimajor axes beyond the 2:1 resonance with Neptune that are neither resonant nor scattering. Using the detached sample from the Outer Solar System Origins Survey (OSSOS) telescopic survey, we produce the first studies of their orbital distribution based on matching the orbits and numbers of the known TNOs after accounting for survey biases. We show that the detached TNO perihelion (q) distribution cannot be uniform but is instead better matched by two uniform components with a break near q ≈ 40 au. We produce parametric two-component models that are not rejectable by the OSSOS data set and estimate that there are $36{,}000^{+12{,}000}_{-9000}$ detached TNOs with absolute magnitudes $H_r < 8.66$ ($D \gtrsim 100$ km) and semimajor axes 48 au < a < 250 au (95% confidence limits). Although we believe that these heuristic two-parameter models yield a correct population estimate, we then use the same methods to show that the perihelion distribution of a detached disk created by a simulated rogue planet matches the q distribution even better, suggesting that the temporary presence of other planets in the early solar system is a promising model to create today’s large semimajor axis TNO population. This cosmogonic simulation results in a detached TNO population estimate of $48{,}000^{+15{,}000}_{-12{,}000}$. Because this illustrates how difficult-to-detect q > 50 au objects are likely present, we conclude that there are $(5 \pm 2) \times 10^{4}$ dynamically detached TNOs, roughly twice as many as in the entire trans-Neptunian hot main belt.
  5. There is a complex inclination structure present in the trans-Neptunian object (TNO) orbital distribution in the main classical-belt region (between orbital semimajor axes of 39 and 48 au). The long-term gravitational effects of the giant planets make TNO orbits precess, but nonresonant objects maintain a nearly constant “free” inclination ($I_{\rm free}$) with respect to a local forced precession pole. Because of the likely cosmogonic importance of the distribution of this quantity, we tabulate free inclinations for all main-belt TNOs, each individually computed using barycentric orbital elements with respect to each object’s local forcing pole. We show that the simplest method, based on the Laplace–Lagrange secular theory, is unable to give correct forcing poles for objects near the $\nu_{18}$ secular resonance, resulting in poorly conserved $I_{\rm free}$ values in much of the main belt. We thus instead implemented an averaged Hamiltonian to obtain the expected nodal precession for each TNO, yielding significantly more accurate free inclinations for nonresonant objects. For the vast majority (96%) of classical-belt TNOs, these $I_{\rm free}$ values are conserved to < 1° over 4 Gyr numerical simulations, demonstrating the advantage of using this well-conserved quantity in studies of the TNO population and its primordial inclination profile; our computed distributions only reinforce the idea of a very coplanar surviving “cold” primordial population, overlain by a large $I$-width implanted “hot” population.
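
Item 5 describes the free inclination as an orbit's inclination measured relative to a local forced precession pole. The sketch below shows one simple geometric reading of that idea: the angle between the orbit's pole and a given forcing pole. It is an illustration under stated assumptions, not the authors' procedure. The function names and the choice to pass the forcing pole as an inclination and node are hypothetical, and the abstract's actual contribution (finding the forcing pole from an averaged Hamiltonian) is not reproduced here.

```python
import math


def pole(inc_deg, node_deg):
    """Unit vector along the orbit normal for inclination I and ascending node Omega."""
    i = math.radians(inc_deg)
    o = math.radians(node_deg)
    return (math.sin(i) * math.sin(o), -math.sin(i) * math.cos(o), math.cos(i))


def free_inclination_deg(inc_deg, node_deg, forced_inc_deg, forced_node_deg):
    """Angle in degrees between the orbit pole and an assumed forcing pole.

    The forcing pole is taken as an input here (hypothetical interface);
    determining it is the difficult part and is outside this sketch.
    """
    a = pole(inc_deg, node_deg)
    b = pole(forced_inc_deg, forced_node_deg)
    dot = sum(x * y for x, y in zip(a, b))
    cross = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )
    # atan2 of |a x b| and a.b stays accurate for the small angles typical of cold orbits.
    return math.degrees(math.atan2(math.hypot(*cross), dot))
```

For example, an orbit with I = 5° whose node coincides with that of a forcing pole inclined by 2° has an angle of 3° to that pole under this definition, while the same orbit with its node rotated by 180° has an angle of 7°.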